Menu:

  1. Dataset Characteristics

  2. Analysis Steps

  3. Analysis Summary

  4. Contributorship

Dataset Characteristics

Citation:

Baltimore Police Department. (2016). Calls for Service [Calls_for_Service.csv]. Retrieved October 10, 2016, from https://data.baltimorecity.gov/Public-Safety/Calls-for-Service/xviu-ezkt

License:

This data set is released under the Creative Commons Attribution 3.0 Unported License. Under this license, users may use, share, and adapt the data without restraint, provided they give appropriate credit to the dataset owner, indicate any changes made to the original data, and include a link to the license. License link: https://creativecommons.org/licenses/by/3.0/legalcode

Metadata:

This data set contains a list of all the 911 emergency calls made in the city of Baltimore from the year 2015.

The dataset contains no metadata other than the column names. We used our best judgment to interpret what each column represents.

Here are the exact column names, and our interpretation of them:

  1. callDateTime (Date and time when the call was made)
  2. priority (Priority of the call, ex. Low, High)
  3. district (District codes from where the calls were made)
  4. description (Description of the call)
  5. callNumber (Unique reference ID for each call)
  6. incidentLocation (Street from where the call originated)
  7. location (latitude and longitude coordinates from where the call originated)
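Since the dataset ships without a data dictionary, a quick structural inspection in R helps verify these interpretations. The sketch below uses a hypothetical two-row stand-in that mirrors the column layout, so it runs without the full file:

```r
# Hypothetical two-row stand-in mirroring the real column layout,
# so the inspection commands run without the ~2-million-row file.
calls <- data.frame(
  callDateTime     = c("07/05/2015 01:08:00 AM", "07/05/2015 11:39:00 PM"),
  priority         = c("High", "Low"),
  district         = c("ED", "SE"),
  description      = c("911/NO  VOICE", "BURGLARY"),
  callNumber       = c("P153060067", "P150541007"),
  incidentLocation = c("1000 E PRESTON ST", "1800 ORLEANS ST"),
  location         = c("(39.305205,-76.603501)", "(39.296167,-76.591938)"),
  stringsAsFactors = FALSE
)

str(calls)                            # one line per column: type and sample values
colSums(is.na(calls) | calls == "")   # count of blank/missing cells per column
```

The colSums line is how issues such as the missing 'location' and 'priority' values surface quickly on the real data.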

Issues encountered with the data

  1. There are missing values in a few columns, namely ‘location’ (latitude/longitude) and ‘priority’.
  2. The ‘callDateTime’ values were not standardized (ex. some values were in 24-hour format, others in AM/PM format).
  3. The meaning of the ‘callNumber’ values could not be identified conclusively (ex. ‘P152310750’).
  4. The ‘callNumber’ column (the unique reference ID) skips numbers in the original data set.
  5. Many of the values in the ‘description’ column were meaningless (ex. ‘*’ and ‘56’).
  6. Many of the values in the ‘description’ column were misspelled (ex. ‘BURGLARPY’).
  7. The values in the ‘district’ column were vague abbreviations (ex. ‘SW’, ‘NE’).

We did not remediate every data issue we found, only those that pertained to the questions we wanted to answer, so as not to modify the data set unnecessarily.

Analysis Steps

1. How we obtained the data

We downloaded the Baltimore Emergency Calls data set from https://data.baltimorecity.gov/Public-Safety/Calls-for-Service/xviu-ezkt

The original data set consists of almost 2 million records. We downloaded the data in CSV format, then imported the data into R.

We decided to use a random sample of 50,000 rows for analysis of the first and second question. One team member took a random sample, and distributed the same random sample to the rest of the team, so that we would all have the exact same sample.

Step by step instructions:

##Load the data into R
mydata=read.csv('Final Subset Data CSV.csv')

##Take a random sample of 50,000 rows
subsetcalls <- mydata[sample(1:nrow(mydata), 50000, replace=FALSE),]

##Convert to Excel format, so that the random sample can be distributed to the team members
##Install and load the xlsx package
library(xlsx)
write.xlsx(x = subsetcalls, file = "test.excelfile.xlsx", sheetName = "TestSheet", row.names = FALSE)
Index  CallDateTime  Priority  District  Description      CallNumber   IncidentLocation         Location
44699  0:49:00       High      ED        & 99             P153060067   HARFORD RD/CLIFTVIEW AV  (39.315795,-76.596304)
11175  11:39:00      Low       SE        **SERV OF A WAR  150541007    1800 ORLEANS ST          (39.2961670,-76.5919375)
475    2:04:00       Medium    NE        *                150040299    5000 PLYMOUTH RD         (39.3514250,-76.5742210)
42673  16:06:00      Medium    WD        *`               150131865    0 N FULTON AV            (39.2882849,-76.6452158)
23032  1:08:00       High      ED        *2 MALE          150460163    1000 E PRESTON ST        (39.3052050,-76.6035010)
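Distributing the sampled spreadsheet works, but fixing the random seed would make the draw itself reproducible, so every team member gets the identical 50,000 rows from the same input file. A sketch with a hypothetical stand-in data frame (the seed value 42 is arbitrary):

```r
# Hypothetical stand-in for the imported data; in practice this would be
# the full Calls for Service CSV with almost 2 million rows.
mydata <- data.frame(callNumber = sprintf("P%07d", 1:100000))

set.seed(42)                         # arbitrary, but shared by the whole team
idx <- sample(nrow(mydata), 50000)   # replace = FALSE is the default
subsetcalls <- mydata[idx, , drop = FALSE]

# Re-running with the same seed on the same input selects the same rows,
# so no sample file needs to be passed around.
```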

2. Data Cleaning and Analysis Script for the first question

Question 1: Are more 911 calls related to burglary made during the night? Or is that just a preconceived notion?

To answer this, I first need to perform data cleaning on the ‘Description’ and ‘CallDateTime’ fields.

Data Cleaning

##---CallDateTime field---
## 1. This question needs the time and not the date, so remove the dates using Excel
##      Click on the column header so that the entire column and all its rows are highlighted.
##      Right click and select 'Format Cells'. 
##      In the pop up that opens, go to the 'Number tab', and select the Category as 'Time'.
##      Select the time format '13:30:55' and click on 'Ok'. Now the date values should disappear

## 2. Read data into R

subsetcalls=read.csv('Final Subset Data CSV.csv')

## 3. Convert into 'Time' format
## Install chron package
library(chron)
subsetcalls$CallDateTime=chron(times=subsetcalls$CallDateTime)

##---Description field---
## 1. Look through all the distinct 'Description' values and identify the ones related to robberies
## Install and load sqldf - so that SQL commands can be used in R
library(sqldf)
allrobbery<-sqldf('SELECT DISTINCT subsetcalls.Description FROM subsetcalls')
allrobbery<-sqldf('SELECT DISTINCT subsetcalls.Description, COUNT(Description) FROM subsetcalls WHERE Description in ("ATTEMP BURGLARY", "BURGLARPY", "Burglary", "BURGLARY", "POSS BURGLARY", "POSS STOLEN", "Robbery Armed", "ROBBERY ARMED", "Robbery Unarmed", "ROBBERY UNARMED", "ATTEMPT THEFT") GROUP BY Description')
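The same distinct-value counts can be obtained without SQL by using base R's table() function; a sketch on a small hypothetical vector of Description values:

```r
# Hypothetical Description values
desc <- c("BURGLARY", "BURGLARPY", "Robbery Armed", "BURGLARY", "AUTO ACC")

# Counts of every distinct value - equivalent to SELECT ... GROUP BY Description
table(desc)

# Restricted to the burglary/robbery spellings of interest
wanted <- c("BURGLARY", "BURGLARPY", "Robbery Armed")
table(desc[desc %in% wanted])
```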


## 2. Modify all the 'Description' values related to robberies to read "BURGLARY"
subsetcalls$Description[subsetcalls$Description=="ATTEMP BURGLARY"]<-"BURGLARY"
subsetcalls$Description[subsetcalls$Description=="BURGLARPY"]<-"BURGLARY"
subsetcalls$Description[subsetcalls$Description=="Burglary"]<-"BURGLARY"
subsetcalls$Description[subsetcalls$Description=="POSS BURGLARY"]<-"BURGLARY"
subsetcalls$Description[subsetcalls$Description=="POSS STOLEN"]<-"BURGLARY"
subsetcalls$Description[subsetcalls$Description=="Robbery Armed"]<-"BURGLARY"
subsetcalls$Description[subsetcalls$Description=="ROBBERY ARMED"]<-"BURGLARY"
subsetcalls$Description[subsetcalls$Description=="Robbery Unarmed"]<-"BURGLARY"
subsetcalls$Description[subsetcalls$Description=="ROBBERY UNARMED"]<-"BURGLARY"
subsetcalls$Description[subsetcalls$Description=="ATTEMPT THEFT"]<-"BURGLARY"
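The ten single-value replacements above can be collapsed into one vectorized assignment with the %in% operator; a sketch on a hypothetical Description vector:

```r
burglary_variants <- c("ATTEMP BURGLARY", "BURGLARPY", "Burglary",
                       "POSS BURGLARY", "POSS STOLEN",
                       "Robbery Armed", "ROBBERY ARMED",
                       "Robbery Unarmed", "ROBBERY UNARMED", "ATTEMPT THEFT")

# Hypothetical Description values
Description <- c("BURGLARPY", "AUTO ACC", "Robbery Armed", "BURGLARY")
Description[Description %in% burglary_variants] <- "BURGLARY"
Description
# [1] "BURGLARY" "AUTO ACC" "BURGLARY" "BURGLARY"
```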

## 3. The final modified subset for this particular question, after data cleaning
finalburglary<-sqldf('SELECT subsetcalls.Description, subsetcalls.CallDateTime FROM subsetcalls WHERE Description = "BURGLARY" order by subsetcalls.CallDateTime')

To perform analysis, I need to categorize the times as Morning, Daytime, Evening, and Night

## 1. Create a new column called 'Time_Of_Day', and enter the default value as 'Night'
finalburglary["Time_Of_Day"]<- 'Night'

## 2. Label the hours from 5AM-11AM as Morning, 11AM-5PM as Daytime, and 5PM-11PM as Evening
finalburglary$Time_Of_Day[finalburglary$CallDateTime > '05:00:00' & finalburglary$CallDateTime <= '11:00:00'] <- 'AMMorning'
finalburglary$Time_Of_Day[finalburglary$CallDateTime > '11:00:00' & finalburglary$CallDateTime <= '17:00:00'] <- 'Daytime'
finalburglary$Time_Of_Day[finalburglary$CallDateTime > '17:00:00' & finalburglary$CallDateTime <= '23:00:00'] <- 'Evening'

## I have entered Morning as "AMMorning" because the 4 categories are displayed by alphabetical order in the plot, and Morning needs to appear first. I have changed the category label name in the plot.
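The same bucketing can be done with a small helper function plus an explicitly ordered factor, which avoids the ‘AMMorning’ alphabetical workaround entirely. A base-R sketch on hypothetical times, bucketed by hour to match the 6-hour windows above:

```r
times <- c("03:12:00", "08:45:00", "13:30:00", "19:05:00", "23:40:00")
hours <- as.integer(substr(times, 1, 2))

bucket <- function(h) {
  if (h >= 5 && h < 11)       "Morning"
  else if (h >= 11 && h < 17) "Daytime"
  else if (h >= 17 && h < 23) "Evening"
  else                        "Night"
}

# The factor levels, not alphabetical order, now control plotting order
time_of_day <- factor(vapply(hours, bucket, character(1)),
                      levels = c("Morning", "Daytime", "Evening", "Night"))
table(time_of_day)
```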

Plot the data

## Obtain the counts of each category in the form of a table, this will help with plotting the data
burglarycounts<- table(finalburglary$Time_Of_Day)

## Used a barplot. Set the X and Y axis labels, the column names, and the colors.
barplot(burglarycounts, main = "Burglaries Count by Time of Day", xlab = "Time of Day", ylab = "Number of Burglaries", ylim = c(0, 600), col = c("lightseagreen", "brown1", "chartreuse3", "darkorchid1"), names.arg=c("Morning", "Day Time", "Evening", "Night"))

## Display the burglary count and timing for each category
text(0.7, 100, "5AM-11AM")
text(1.9, 100, "11AM-5PM")
text(3.1, 100, "5PM-11PM")
text(4.3, 100, "11PM-5AM")
text(0.7, 270, "234")
text(1.9, 470, "428")
text(3.1, 470, "432")
text(4.3, 270, "228")
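Rather than hard-coding x positions such as 0.7 and 1.9, barplot() invisibly returns the midpoint of each bar, and those midpoints can be passed straight to text(). A self-contained sketch using the counts from the plot above:

```r
burglarycounts <- c(Morning = 234, "Day Time" = 428, Evening = 432, Night = 228)
windows <- c("5AM-11AM", "11AM-5PM", "5PM-11PM", "11PM-5AM")

# barplot() returns the x midpoint of each bar
mids <- barplot(burglarycounts,
                main = "Burglaries Count by Time of Day",
                xlab = "Time of Day", ylab = "Number of Burglaries",
                ylim = c(0, 600))

text(mids, 100, windows)                         # time window inside each bar
text(mids, burglarycounts + 30, burglarycounts)  # count just above each bar
```

This keeps the annotations aligned even if the number of bars or their spacing changes.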

3. Data Cleaning and Analysis Script for the second question

Question 2: Does the number of blank calls increase or decrease with the arrival of the weekend?

To answer this, I first need to perform data cleaning on the ‘Description’ and ‘CallDateTime’ fields.

Data Cleaning

#---Standardize the Date column in excel---
#      Click on the column header so that the entire column and all its rows are highlighted.
#      Right click and select 'Format Cells'. 
#      Change all the values to the format 'mm/dd/yyyy hh:mm'

# Load the data into R
subsetcalls = read.csv("Final Subset Data Original CSV 1.csv")

#---Description field---

#importing library sqldf - so that SQL commands can be used in R
library(sqldf)

#selecting the distinct call descriptions that contain "Hangup" and storing them in a table
distinctDescriptionValues <- sqldf('SELECT DISTINCT subsetcalls.Description FROM subsetcalls
                                   WHERE Description like "%Hangup%"')

#selecting only the various types of no-voice calls and their corresponding counts and storing them in a table
allblankcalls=sqldf('SELECT DISTINCT subsetcalls.Description, COUNT(Description)
                    from subsetcalls WHERE Description in
                    ("911/NO  VOICE","911/No Voice")
                    GROUP BY Description')

#selecting only the various types of hangup calls and their corresponding counts and storing them in a table
allblankcalls1=sqldf('SELECT DISTINCT subsetcalls.Description, COUNT(Description)
                     from subsetcalls WHERE Description in
                     ("911/HANGUP","911/HANGUP.","911 HANGUP")
                     GROUP BY Description')

#converting the column description from factor to character
subsetcalls$Description=as.character(subsetcalls$Description)

#replacing all forms of no voice with a single NO_VOICE label
subsetcalls$Description[subsetcalls$Description=="911/NO  VOICE"]="NO_VOICE"
subsetcalls$Description[subsetcalls$Description=="911/No Voice"]="NO_VOICE"

#replacing all forms of hangups with a single HANGUP
subsetcalls$Description[subsetcalls$Description=="911/HANGUP"]="HANGUP"
subsetcalls$Description[subsetcalls$Description=="911/HANGUP."]="HANGUP"
subsetcalls$Description[subsetcalls$Description=="911 HANGUP"]="HANGUP"

#converting the description column back to factor
subsetcalls$Description=as.factor(subsetcalls$Description)

#checking the converted values
allblankcalls=sqldf('SELECT DISTINCT subsetcalls.Description, COUNT(Description)
                    from subsetcalls WHERE Description in
                    ("NO_VOICE")
                    GROUP BY Description')

allblankcalls1=sqldf('SELECT DISTINCT subsetcalls.Description, COUNT(Description)
                     from subsetcalls WHERE Description in
                     ("HANGUP")
                     GROUP BY Description')
allblankcalls
##   Description COUNT(Description)
## 1    NO_VOICE               9411
allblankcalls1
##   Description COUNT(Description)
## 1      HANGUP                 85
#---CallDateTime field---
#using library timeDate. Install timeDate package
library(timeDate)

#extracting the date portion of the CallDateTime column into a new dates column
subsetcalls$dates=format(as.POSIXct(subsetcalls$CallDateTime,format='%m/%d/%Y'),format='%m/%d/%Y')

#converted dates column into date format (was in factor format)
subsetcalls$dates=as.timeDate(subsetcalls$dates)

#adding a column called days that converts the dates into days of the week
subsetcalls$days=dayOfWeek(subsetcalls$dates)

#based on the days column - creating the weekend column that marks all weekends (SAT & SUN) as TRUE
subsetcalls$weekend=isWeekend(subsetcalls$dates)

#extracting the time portion of the CallDateTime column into a new time column
subsetcalls$time=format(as.POSIXct(subsetcalls$CallDateTime, format="%m/%d/%Y %H:%M"), format="%H:%M:%S")

#use the chron library to convert into time format. Install the chron package
library(chron)

#this converts the time into times format and puts it into the time2 column
subsetcalls$time2<- chron(times=subsetcalls$time)

#this adds an extra column that is TRUE when Friday is also counted as part of the weekend (wday=1:4 treats only Mon-Thu as weekdays)
subsetcalls$weekend1=isWeekend(subsetcalls$dates,1:4)

#This makes a subset of all values that correspond to the weekend,
#with Friday from 7 PM onward also included
abc1=subset(subsetcalls, subsetcalls$weekend == "TRUE" | (subsetcalls$days=="Fri" & subsetcalls$time2 >= "19:00:00"))
#this makes a subset of all the values on weekdays, up to Friday 7 PM
abc2=subset(subsetcalls,subsetcalls$weekend1 =="FALSE" | (subsetcalls$days=="Fri" & subsetcalls$time2 < "19:00:00"))
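The weekend split can also be cross-checked without the timeDate package, using format(..., "%u"), which returns the ISO weekday number (Mon = 1 ... Sun = 7). A sketch on hypothetical timestamps, with Friday 7 PM onward counted as weekend, as above:

```r
d <- as.POSIXct(c("2015-07-03 18:30:00",   # Friday, before 7 PM
                  "2015-07-03 21:00:00",   # Friday, after 7 PM
                  "2015-07-04 09:00:00",   # Saturday
                  "2015-07-06 12:00:00"),  # Monday
                tz = "UTC")

wday <- as.integer(format(d, "%u"))   # 1 = Mon ... 7 = Sun
hour <- as.integer(format(d, "%H"))

is_weekend <- wday %in% 6:7 | (wday == 5 & hour >= 19)
is_weekend
# [1] FALSE  TRUE  TRUE FALSE
```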

#WITH FRIDAY AS PART OF WEEKEND
#this is the dataset of weekend with friday and NO_VOICE
weekend_1=subset(abc1,abc1$Description=="NO_VOICE")
#this is the dataset of weekend with friday and HANGUP
weekend_2=subset(abc1,abc1$Description=="HANGUP")

#this is the dataset of NO_VOICE in weekdays (friday excluded)
weekday_1=subset(abc2,abc2$Description=="NO_VOICE")
#this is the dataset of HANGUP in weekdays (friday excluded)
weekday_2=subset(abc2,abc2$Description=="HANGUP")



#WITHOUT FRIDAY AS PART OF WEEKEND
#this is the dataset of weekend without friday and NO_VOICE
weekend_3=subset(subsetcalls, subsetcalls$weekend == "TRUE" & subsetcalls$Description == "NO_VOICE")
#this is the dataset of weekend without friday and HANGUP
weekend_4=subset(subsetcalls, subsetcalls$weekend == "TRUE" & subsetcalls$Description == "HANGUP")

#this is the dataset of NO_VOICE in weekdays (friday included)
weekday_3=subset(subsetcalls, subsetcalls$weekend == "FALSE" & subsetcalls$Description == "NO_VOICE")
#this is the dataset of HANGUP in weekdays (friday included)
weekday_4=subset(subsetcalls, subsetcalls$weekend == "FALSE" & subsetcalls$Description == "HANGUP")

Plot the data

#Plotting Graphs
#bar plots
#converting days column from character to factor
subsetcalls$days=as.factor(subsetcalls$days)

#forming a subset for overall number of no_voice and hangup calls
subsetcalls1=subset(subsetcalls, subsetcalls$Description=="NO_VOICE")
subsetcalls2=subset(subsetcalls, subsetcalls$Description=="HANGUP")

#printing out the summary of the days column - this gives us the number of calls for each day
summary(subsetcalls1$days)
##  Fri  Mon  Sat  Sun  Thu  Tue  Wed 
## 1398 1365 1287 1276 1316 1328 1441
summary(subsetcalls2$days)
## Fri Mon Sat Sun Thu Tue Wed 
##  15   5  18  10  14  13  10
#saving the number of no_voice and hangup calls into vectors
no_voice=c(1365,1328,1441,1316,1398,1287,1276)
hangup=c(15,5,18,10,14,13,10)

#plotting number of no_voice calls daywise
barplot(no_voice, main="NO VOICE CALLS", beside=TRUE, 
        col=terrain.colors(7),ylim = c(0,1600),xlab = "DAYS - Monday through Sunday",ylab = "Number of NO_VOICE Calls",col.lab="Blue",col.main="Red")

#plotting the number of hangup calls daywise
barplot(hangup, main="HANGUP CALLS", beside=TRUE, 
        col=terrain.colors(5),ylim = c(0,25),xlab = "DAYS - Monday through Sunday",ylab = "Number of Calls",col.lab="Blue",col.main="Red")
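Hand-typing the counts into vectors is error-prone; the table printed by summary() can instead be reordered by name and plotted directly. A sketch using the NO_VOICE counts reported above:

```r
day_levels <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# Counts as printed by summary(subsetcalls1$days) above (alphabetical order)
no_voice_tab <- c(Fri = 1398, Mon = 1365, Sat = 1287, Sun = 1276,
                  Thu = 1316, Tue = 1328, Wed = 1441)

no_voice <- no_voice_tab[day_levels]   # reorder Monday through Sunday by name
barplot(no_voice, main = "NO VOICE CALLS", ylim = c(0, 1600),
        xlab = "Day", ylab = "Number of NO_VOICE Calls")
```

A side benefit: the bars are labeled with the day names automatically, instead of a single "Monday through Sunday" axis label.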


#ggplot
#using library ggplot, install the ggplot2 package
library(ggplot2)

#reading the file consolidated.csv, which has all the values for the number of calls during the weekend/weekday
q=read.csv("consolidated.csv")

#ggplot
ggplot(q,aes(x=variable,y=count,fill=factor(index)))+
  geom_bar(stat="identity",position="dodge")+
  scale_fill_discrete(name="Legend",
                      breaks=c(1, 2,3,4),
                      labels=c("Weekend(Fri Included)", "Weekday(Fri excluded)","Weekend(Friday Excluded)","Weekday(Friday Included)"))+
  xlab("Type of Calls")+ylab("Number of Calls")

4. Data Cleaning and Analysis Script for the third question

Question 3: Find the top 5 dangerous reasons for calls (e.g. Aggravated assault). Plot them based on the locations.

  1. From which location (street) did most of these calls originate? Can we find the most dangerous locality in Baltimore based on the 911 calls?

  2. Is there a pattern for each of these calls with respect to their proximity to the coast? Do these calls increase in frequency when you get closer to the coast?

Data Cleaning

# Downloaded a sample of our dataset having callDateTime values starting from 01 January 2016 up to 10th Dec 2016
# Removed all rows having location values as (,) and (0.000000,0.000000) by filtering the data in Excel
# Split the location column values into two columns Latitude and Longitude in Excel using the 'Text to Columns' option

# Importing the dataset obtained after performing the above cleaning operations
subsetcalls = read.csv("Calls_for_Service.csv")
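The location split done in Excel can equally be done in R by stripping the parentheses and splitting on the comma; a sketch on two hypothetical location strings:

```r
# Hypothetical location values in the "(lat,long)" format used by the dataset
loc <- c("(39.315795,-76.596304)", "(39.296167,-76.591938)")

parts <- strsplit(gsub("[()]", "", loc), ",", fixed = TRUE)
Latitude  <- as.numeric(vapply(parts, `[`, "", 1))
Longitude <- as.numeric(vapply(parts, `[`, "", 2))
```

Rows with empty "(,)" or zero coordinates can then be dropped with a logical filter instead of an Excel pass.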

#cleaning on description column
#importing library sqldf - so that sql commands can be used in R
library(sqldf)

#selecting all distinct types of call descriptions and storing into a table
distinctDescriptionValues<-sqldf('select distinct description,count(description) from subsetcalls group by description')

# Selected the top dangerous reasons for 911 calls as 'Aggravated Assault','Destruction of property','Discharge Firearm/Shooting' and 'Robbery'
# Grouped similar categories having considerable count of rows and gave them a common label in Excel. Used the filter option for selecting descriptions 
# For each selected reason, I found different variations of the same description label (different cases and spellings) and created a common label

# AGGRAV ASSAULT,Aggrav Assault ->  Aggravated Assault  
# DESTRUCT PROP,DESTRUCT PROPTY,Destruct Propty -> Destruction of property  
# DISCHRG FIREARM,Dischrg Firearm,SHOOTING,Shooting -> Discharge Firearm & Shooting
# ROBBERY ARMED,Robbery Armed,ROBBERY UNARMED,Robbery Unarmed -> Robbery    
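The Excel relabeling listed above can also be expressed in R as a named lookup vector, which both documents the mapping and makes it repeatable; a sketch on hypothetical description values:

```r
relabel <- c("AGGRAV ASSAULT"  = "Aggravated Assault",
             "Aggrav Assault"  = "Aggravated Assault",
             "DESTRUCT PROP"   = "Destruction of property",
             "DESTRUCT PROPTY" = "Destruction of property",
             "Destruct Propty" = "Destruction of property",
             "DISCHRG FIREARM" = "Discharge Firearm & Shooting",
             "Dischrg Firearm" = "Discharge Firearm & Shooting",
             "SHOOTING"        = "Discharge Firearm & Shooting",
             "Shooting"        = "Discharge Firearm & Shooting",
             "ROBBERY ARMED"   = "Robbery",
             "Robbery Armed"   = "Robbery",
             "ROBBERY UNARMED" = "Robbery",
             "Robbery Unarmed" = "Robbery")

# Hypothetical raw description values
desc <- c("SHOOTING", "Aggrav Assault", "ROBBERY ARMED")
unname(relabel[desc])
```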


# Load the cleaned Excel file having only the records belonging to the four descriptions mentioned above
dangerouscalls = read.csv("dangerous_calls.csv")

Plot the data

# Creating a map of Baltimore and plotting the locations of the most dangerous 911 calls as shortlisted above
# get_map is a smart wrapper that queries the Google Maps, OpenStreetMap, Stamen Maps, or Naver Map servers for a map.
# It returns a ggmap object.
# The source argument tells get_map which map provider to use. By default, it uses Google Maps.
# geom_point is used to plot points on the map. It is generally used to create scatter plots.
# scale_color_manual is used to map color values.
# ggtitle gives a title to the plot.
library(ggmap)
ggmap(get_map(location = "baltimore",zoom=12)) + 
geom_point(data=dangerouscalls,aes(x=Longitude,y=Latitude,colour=description)) + 
scale_color_manual(values=c("red","blue","orange","black")) + ggtitle("Top dangerous reasons for 911 calls in Baltimore County")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=baltimore&zoom=12&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=baltimore&sensor=false

# Creating dataframes for each description and plotting them individually on the map

# Discharge Firearm & Shooting
dischgfirearm = subset(dangerouscalls,dangerouscalls$description=="Discharge Firearm & Shooting")
ggmap(get_map(location = "baltimore",zoom=12)) + geom_point(data=dischgfirearm,aes(x=Longitude,y=Latitude),color="orange") + ggtitle("Calls related to Discharge Firearm/Shooting in 2016")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=baltimore&zoom=12&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=baltimore&sensor=false

# Aggravated Assault
aggassault = subset(dangerouscalls,dangerouscalls$description=="Aggravated Assault")
ggmap(get_map(location = "baltimore",zoom=12)) + geom_point(data=aggassault,aes(x=Longitude,y=Latitude),color="red") + ggtitle("Calls related to Aggravated Assaults in 2016")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=baltimore&zoom=12&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=baltimore&sensor=false

# Destruction of property
destructprop = subset(dangerouscalls,dangerouscalls$description=="Destruction of property")
ggmap(get_map(location = "baltimore",zoom=12)) + geom_point(data=destructprop,aes(x=Longitude,y=Latitude),color="blue") + ggtitle("Calls related to Destruction of property in 2016")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=baltimore&zoom=12&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=baltimore&sensor=false

# Robbery
robbery = subset(dangerouscalls,dangerouscalls$description=="Robbery")
ggmap(get_map(location = "baltimore",zoom=12)) + geom_point(data=robbery,aes(x=Longitude,y=Latitude),color="black") + ggtitle("Calls related to Robbery in 2016")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=baltimore&zoom=12&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=baltimore&sensor=false

# This lists the district codes with the count of dangerous calls in each
summary(dangerouscalls$district)
##   CD   CW   ED EVT2   ND   NE   NW   SD   SE   SS   SW  TRU   WD 
## 3408   22 3126    1 3100 4627 3128 3889 3760   11 3336  814 2922
# Creating a vector of the counts for the nine largest districts, in decreasing order
districts=c(4627,3889,3760,3408,3336,3128,3126,3100,2922)

# Plotting a bar plot which shows the districtwise distribution of dangerous 911 calls
barplot(districts, main="Districtwise distribution of 911 calls related to dangerous crimes", beside=TRUE,
col=terrain.colors(9),ylim = c(0,5500),names.arg = c("NE","SD","SE","CD","SW","NW","ED","ND","WD"),
xlab = "District",ylab = "Number of 911 Calls",col.lab="Blue",col.main="Red")

Analysis Summary

This data set contains a list of all the 911 emergency calls made in the city of Baltimore from the year 2015, and is updated every day. The data is gathered by ‘Open Baltimore’ and the city of Baltimore, and contains almost 2 million records.

The data contains information such as the time of call, latitude/longitude, district, description, and priority. We downloaded the data, and used Excel and R to analyze the data and produce plots. We came up with 3 questions that could possibly be answered using the data.

Question 1: Are more 911 calls related to burglary made during the night? Or is that just a preconceived notion?

The target audience here is Baltimore’s police department (PD). This analysis could help the police department decide if they need to be more alert for burglaries and increase their patrol duty during the night. After processing the data, we were able to get a count of all the burglary related calls for each time of the day: morning, afternoon, evening, and night.

This plot shows that the number of burglary related calls for the afternoon and evening is almost double the number of calls for the morning and night. The graph suggests that the idea that more burglaries occur at night is a preconceived notion. The graph also suggests that more burglaries occur during the afternoon and evening hours (11AM-11PM) when people are outside and their homes are empty. Burglars target houses when they are empty, rather than when it is dark outside. Hence, the PD need not increase its patrol duty during the night time. This analysis does have limitations. This data only reports 911 calls that were made about burglaries, it does not include the regular theft reports directly made to the police department.

Question 2: Does the number of blank calls increase or decrease with the arrival of the weekend? To answer this question, I need to use the columns ‘callDateTime’ and ‘description’. The target audience for this analysis is the Baltimore PD. If there are more blank calls during the weekend, the PD can infer that most of them are prank calls made by partygoers.

The graph shows that the number of blank calls does not change across the week. Hence the PD cannot make assumptions about blank calls made during the weekend based on this analysis.

Question 3: Find the top 5 dangerous reasons for calls. Plot them based on the locations.

Can we find the most dangerous locality in Baltimore based on the 911 calls?

Is there a pattern for the calls with the proximity to the coast?

The target audience for this analysis is prospective home buyers in Baltimore County. They could avoid buying a house in the most dangerous locality.

(Map of Baltimore with the dangerous-call locations plotted; 203 rows with missing coordinate values were removed from the plot.)

According to the graph, the center of Baltimore city has the most emergency calls; the farther from the city center, the fewer the calls. Furthermore, there is no pattern of calls with respect to the coast. Hence prospective home buyers can look for homes away from the center of the city, and need not consider proximity to the coast.

Contributorship

Our project has three different research questions. We divided the questions amongst ourselves so that each of us could concentrate on analyzing and answering one question. In addition to working on our own questions, we divided the task of preparing each of the three draft documents by aggregating information from the other two team members. Divya prepared the Data Cleaning Draft and the first question. Akash prepared the R Implementation Draft and the third question. Samarjith prepared the R Plot Draft and the second question. For the final Git package, all of us worked on our individual components. Divya aggregated the components and prepared the final zip package for uploading to Git. Akash uploaded the zip package to Git. Akash and Samarjith worked on the presentation with inputs from Divya.